Enabling What-if Explorations in Systems (CMU-PDL-07-103)

نویسنده

  • Eno Thereska
چکیده

With a large percentage of total system cost going to system administration tasks, ease of system management remains a difficult and important goal. As a step towards that goal, this dissertation presents a success story on building systems that are self-predicting. Self-predicting systems continuously monitor themselves and provide quantitative answers to What...if questions about hypothetical workload or resource changes. Self-prediction has the potential to simplify administrators’ decision making, such as acquisition planning and performance tuning, by reducing the detailed workload and internal system knowledge required. Self-prediction has as the primary building block mathematical models, that, once built into the system, analyze past, and predict future behavior. Because of the traditional disconnect between systems researchers and theoretical researchers, however, there are fundamental difficulties in enabling existing mathematical models to make meaningful predictions in real systems. In part, this dissertation serves as a bridge between research in theory (e.g., queuing theory and statistical theory) and research in systems (e.g., database and storage systems). It identifies ways to build systems to support use of mathematical models and addresses fundamental show-stoppers that keep models from being useful in practice. For example, we explore many opportunities to deeply understand workload-system interactions by having models be first-class system components, rather than developing and deploying them separately from the system, as is traditionally done. As another example, lack of good measurement information in a distributed system can be a show-stopper for models based on queuing analysis. This

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Towards Self-Predicting Systems: What if You Could Ask “What-if”? (CMU-PDL-05-101)

Today, management and tuning questions are approached using if...then... rules of thumb. This reactive approach requires expertise regarding of system behavior, making it difficult to deal with unforeseen uses of a system’s resources and leading to system unpredictability and large system management overheads. We propose a What...if... approach that allows interactive exploration of the effects...

متن کامل

Informed Data Distribution Selection in a Self-predicting Storage System (CMU-PDL-06-101)

Systems should be self-predicting. They should continuously monitor themselves and provide quantitative answers to What...if questions about hypothetical workload or resource changes. Self-prediction would significantly simplify administrators’ decision making, such as acquisition planning and performance tuning, by reducing the detailed workload and internal system knowledge required. This pap...

متن کامل

DiskReduce: RAID for Data-Intensive Scalable Computing (CMU-PDL-09-112)

Data-intensive file systems, developed for Internet services and popular in cloud computing, provide high reliability and availability by replicating data, typically three copies of everything. Alternatively high performance computing, which has comparable scale, and smaller scale enterprise storage systems get similar tolerance for multiple failures from lower overhead erasure encoding, or RAI...

متن کامل

MEMS-Based Storage Devices and Standard Disk Interfaces: A Square Peg in a Round Hole? (CMU-PDL-03-102)

MEMS-based storage devices (MEMStores) are significantly different from both disk drives and semiconductor memories. The differences motivate the question of whether they need new abstractions to be utilized by systems, or if existing abstractions will be sufficient. This paper addresses this question by examining the fundamental reasons that the abstraction works for existing devices, and by s...

متن کامل

An Analysis of Traces from a Production MapReduce Cluster (CMU-PDL-09-107)

MapReduce is a programming paradigm for parallel processing that is increasingly being used for data-intensive applications in cloud computing environments. An understanding of the characteristics of workloads running in MapReduce environments benefits both the service providers in the cloud and users: the service provider can use this knowledge to make better scheduling decisions, while the us...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015